DOCUMENT RESUME

ED 242 772                                             TM 840 189

AUTHOR       Marsh, Herbert W.
TITLE        The Bias of Negatively Worded Items in Ratings Scales
             for Preadolescent Children: A Cognitive-Developmental
             Phenomenon.
PUB DATE     20 Feb 84
NOTE         28p.
PUB TYPE     Reports - Research/Technical (143)
EDRS PRICE   MF01/PC02 Plus Postage.
DESCRIPTORS  *Cognitive Development; Correlation; Elementary
             Education; Factor Structure; *Negative Forms
             (Language); Preadolescents; Rating Scales; Reading
             Achievement; *Reading Difficulties; *Self Concept
             Measures; Test Bias; *Test Construction; *Test
             Items



ABSTRACT

Negative item bias is produced by the inability of preadolescent
children to respond appropriately to negatively worded items on
rating scales, and is hypothesized to be a cognitive-developmental
phenomenon. The effect is examined with responses to the Self
Description Questionnaire (SDQ), a multifactor self-concept
instrument. In study 1, responses to positive and negative items
were uncorrelated in grade 2 but were substantially correlated by
grade 5. In study 2, confirmatory factor analysis of responses by
grade 5 students demonstrated that the negative items contributed
both to the scale they were designed to measure and to a "negative
item" factor. The negative item factor was nearly uncorrelated with
any of the self-concept factors but was substantially correlated
with reading achievement. The two studies demonstrate that younger
children and children with poorer reading skills are less able to
respond appropriately to negatively worded items, and that this
effect produces a bias in their response to the SDQ. This supports
the contention that the effect is a cognitive-developmental
phenomenon. (Author/PN)



***********************************************************************
*   Reproductions supplied by EDRS are the best that can be made      *
*   from the original document.                                       *
***********************************************************************






The Bias of Negatively Worded Items In Ratings Scales For
Preadolescent Children: A Cognitive-Developmental Phenomenon

Herbert W. Marsh

University of Sydney, Australia

20 February, 1984






Running Head: Negative Items



The Bias of Negatively Worded Items In Ratings Scales For
Preadolescent Children: A Cognitive-Developmental Phenomenon

ABSTRACT

The negative item bias is produced by the inability of preadolescent
children to respond appropriately to negatively worded items on rating
scales, and is hypothesized to be a cognitive-developmental
phenomenon. The effect is examined with responses to the Self
Description Questionnaire (SDQ), a multifactor self-concept instrument
whose factor structure, reliability, and validity have been clearly
demonstrated in numerous other studies. In study 1, responses to
positively and negatively worded items were compared for children (n =
658) in grades 2-5. Particularly in grade 2, children frequently
responded "true" to negative items, indicating a very poor
self-concept, when their other responses indicated a very positive
self-concept. Responses to positive and negative items were
uncorrelated (-0.02) in grade 2, but were substantially correlated by
grade 5 (0.60). In study 2 confirmatory factor analyses of responses
by year 5 students (n = 559) demonstrated that the negative items
contributed both to the scale they were designed to measure and to a
"negative item" factor. The negative item factor was nearly
uncorrelated with any of the self-concept factors, but was
substantially correlated with reading achievement (0.42). Taken
together, the two studies demonstrate that younger children and
children with poorer reading skills are less able to respond
appropriately to negatively worded items and that this effect produces
a bias in their response to the SDQ. This supports the contention
that the effect is a cognitive-developmental phenomenon.






The Bias of Negatively Worded Items In Ratings Scales For
Preadolescent Children: A Cognitive-Developmental Phenomenon

Test construction specialists argue for the use of some negatively
worded items on personality, attitude and other rating scale
instruments in order to disrupt response sets such as responding to
all items with the same response category. This procedure is
particularly useful for single-scale instruments where all items are
designed to measure the same construct. For multiscale instruments
the practice seems less useful, and the confirmation of the scales
through procedures such as factor analysis provides a test of such a
response set. The use of negatively worded items assumes that they
measure the same construct as positively worded items. However, this
assumption is rarely tested and its validity seems questionable when
respondents are preadolescent children. In order to respond
appropriately to negatively worded items, respondents often have to
invoke a double negative logic that requires a higher level of verbal
reasoning than do positively worded items. For example, the item "I
am NOT a good student" requires a response of "False" to indicate that
"I am a good student". If this logic is not appropriately employed,
respondents may give an answer which has exactly the opposite meaning
to their intended response. For purposes of this study a negative item
bias is defined to be when a child responds inappropriately by saying
"true" to a negative statement when their responses to positive items
have consistently indicated that the opposite response would be more
appropriate, or vice versa. Such an effect will create a method/halo
bias that is specific to the negative items.
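
The definition above can be made concrete with a small sketch. It is an
illustration only and not a procedure used in the study: the function
names and the cutoff `gap` are hypothetical, and the 1-to-5 response
scale is assumed from the description of the SDQ given later in the
paper. A respondent is flagged when the reverse-scored negative items
point in the opposite direction from the positive items.

```python
def reflect(response, scale_min=1, scale_max=5):
    """Reverse-score a negatively worded item so that a high value
    always means a more positive self-concept."""
    return scale_min + scale_max - response


def shows_negative_item_bias(positive, negative, gap=2.0):
    """Return True when the mean of the positive items and the mean of
    the reflected negative items differ by at least `gap` scale points.

    `gap` is a hypothetical cutoff chosen for illustration only.
    """
    positive_mean = sum(positive) / len(positive)
    negative_mean = sum(reflect(r) for r in negative) / len(negative)
    return abs(positive_mean - negative_mean) >= gap


# A child who says "true" to both positive and negative statements.
inconsistent = shows_negative_item_bias([5, 5, 4, 5], [5, 4])

# A child who says "true" to positive and "false" to negative statements.
consistent = shows_negative_item_bias([5, 5, 4, 5], [1, 2])
```

In this sketch the first child is flagged and the second is not, which
mirrors the "or vice versa" wording of the definition because the
absolute difference is used.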
Development of the SDQ. 

The Self Description Questionnaire (SDQ) is a multifactor self-
concept instrument for preadolescent children. Its factor structure,
reliability, and validity have been clearly demonstrated in numerous
studies. The SDQ is designed to measure seven factors of self-concept
derived from the Shavelson model (Shavelson & Bolus, 1982; Shavelson,
Hubner & Stanton, 1976) and six independent factor analyses of
responses by disparate groups have each identified these factors
(Marsh, Barnes, Cairns & Tidman, in press; Marsh, Relich & Smith,
1983; Marsh, Smith & Barnes, 1983b). The SDQ scales are reliable
(coefficient alphas in the .80's and .90's), moderately correlated
with measures of the corresponding academic abilities (r's from 0.3 to
0.7; see Marsh, Parker & Smith, 1983; Marsh & Parker, in press;
Marsh, Smith & Barnes, 1983a), in agreement with self-concepts
inferred by primary school teachers (see Marsh, Parker & Smith, 1982;
Marsh, Smith, Barnes & Butler, 1983) and reasonably [illegible in
source] (Marsh, Smith, Barnes & Butler, 1983).

In the development of the SDQ, unlike the SDQ II and the SDQ
III which are designed for older subjects, negative items were found
to be ineffective in defining the different areas of self-concept
they were designed to measure. Preliminary analyses indicated that
negatively worded items contributed less to the internal consistency
of the scales, and exploratory factor analyses sometimes revealed a
negative item factor (i.e., a factor on which only negatively worded
items loaded). Younger children in particular often responded "true"
to negative items, indicating a very poor self-concept, when their
responses to positive items consistently indicated a positive self-
concept. This suggested that the problem might be a cognitive-
developmental phenomenon. In subsequent revisions considerable care
was taken in the wording of the negative items so that they were
clearly negative and avoided the problem of double negative reasoning
as much as possible. Thus, an item like "I do not like mathematics"
was changed to "I hate mathematics." However, numerous attempts to
revise the negative items failed to solve the problem and led to the
recommendation that these items should not be included when scoring
the SDQ (Marsh, Barnes, Cairns & Tidman, in press). The purposes of
this study are to examine more carefully the effect as a cognitive-
developmental phenomenon, and to explore its relationship to other
theoretical and methodological perspectives.

Theoretical and Methodological Perspectives.

A wide range of observations from disparate areas of research
appear to be related to the negative item bias. Theoretical findings
in developmental psychology and psycholinguistics may provide a basis
for understanding the effect, while methodological approaches and
findings from personality and achievement testing may provide research
designs helpful in the study of the phenomenon. A review of the
relevant research in each of these areas is beyond the scope of this
study, but it is important to delineate these areas.

A Developmental/Psycholinguistic Perspective. Slobin (1971), and
Klima and Bellugi-Klima (1971) indicate that the concept of negativity
develops very early as is evident in primitive two-word sentences
(e.g., not hungry), but they point out that for complex sentences the
negative transformation of an affirmative sentence becomes more
difficult since the negative element cannot just be placed at the
start or end of the phrase. Braine and Rumain (1983) examined 17
inference schemas of reasoning and the ages at which they are
exhibited. The schema most relevant to the negative item bias is
exemplified by: it is false that there is not a "W", therefore there
is a "W" (p. 278). In their review, Braine and Rumain found limited
developmental research on this schema but reported that 20% of
kindergartners and 90% of 10-year-olds could appropriately apply this
type of inverse reasoning, suggesting that "canceling a negative
develops in the early school years" (p. 285). Researchers have also
identified cognitive-developmental stages in children's ability to
apply other forms of inverse reasoning. Attribution researchers (Kun,
1977; Nicholls, 1978; also see Marsh, Cairns, Relich, Barnes & Debus,
in press) have found that children as young as five understand that
ability and effort each contribute to the likelihood of success, but
it is not until age 10 or later that children understand that less
effort is required to achieve success if the subject is more able.
This research indicates that while the concept of negation develops
very early, the inverse reasoning needed to correctly respond to
negative rating items probably develops during early school years.

Personality Research. The tendency for subjects to respond to
personality rating items independently of the content has been
variously referred to as response set, response bias, response style,
or a method/halo effect, and different approaches emphasize the
nonsubstantive or substantively irrelevant components of responses to
structured items (see Wiggins, 1973 for a review). Jackson (1967;
Jackson & Messick, 1958, 1961) argues that content is what is left
over after sources of style and method have been removed through
approaches such as regression and factor analysis. Most response
style research considers the effects of response tendencies such as
social desirability, where subjects attribute to themselves socially
desirable characteristics, or acquiescence, where subjects tend to
agree to items as self-descriptive independent of the item content. In
a study of acquiescence, Trott and Jackson (1967) suggested that the
influence of style increased when subjects were given less time to
study each item and when each item was more clearly related to the
content dimension that it was designed to measure, but that it was
uncorrelated with verbal ability for university students. While many
possible causes of response styles have been considered, they are
generally not considered as a cognitive-developmental phenomenon as
they are here. Nevertheless, the negatively worded item bias does
qualify as a response set as conceptualized by Wiggins, and
correlational approaches similar to those described by Jackson are
employed in the present investigation.

Positive and negative items designed to measure the same
construct are sometimes found to define two separate factors when
examined with empirical procedures such as factor analysis. Naylor
(1978) reported that responses by university students to a state-
anxiety inventory produced two factors representing respectively
positive and reverse scored items. Androgyny research (e.g., Spence,
Helmreich & Holahan, 1979; Antill, Cunningham, Russell & Thompson,
1981) has found that items designed to measure masculinity and
femininity actually define four factors; masculinity and femininity
are each defined by two separate factors representing positive-valued
and negative-valued items. In instances such as these, it is not
clear whether the difference between positive-item and negative-item
factors is substantive, nonsubstantive, or substantively irrelevant.
Nevertheless, it seems that these examples differ from the phenomenon
examined here in that they occur with subjects who have the cognitive-
developmental ability to respond appropriately to the negative items.

Responses to a personality test, particularly by preadolescent
children, may measure a different construct than that which the test
was intended to measure. For example, Bridgeman and Shipman (1978)
reported that preschool self-concept was significantly correlated to
year 3 scores in reading and math but not to year 3 self-concept.
This led the authors to speculate that the preschool measure of self-
concept was measuring some construct besides self-concept that was
correlated with achievement. Even though their preschool self-concept
measure did not require reading, it probably required a level of
verbal reasoning that was difficult for many preschoolers. Hence, the
preschool responses may have had a substantial verbal component that
biased the interpretation of self-concept, but was predictive of
subsequent reading performance. Ironically, such a bias would
inappropriately make responses to a self-concept instrument seem to be
more valid when assessed against reading achievement or related
measures of academic achievement.

Achievement Testing. Achievement tests are designed to measure
mastery of a particular body of knowledge or proficiency in specific
skills. Cronbach (1971; 1980) and others argue that it is not only
important that a test measure what it is supposed to measure, but also
that it not measure what it is not supposed to measure. As an
example, Cronbach (1980, p. 106) described a content specific test
where the content "is all too often a Sleeping Beauty screened off
from the student by tangled clauses and thorny (but pointless!)
jargon." He suggests that any item that is as highly correlated with
scores on a reading comprehension test as the total score on the
specific achievement test has serious invalidity for evaluating the
content. Cronbach (1971) describes factor analytic and correlational
techniques for separating content-specific variance from that due to
other causes, in much the same way as personality researchers examine
the effect of response styles.
The Present Investigation.

The purposes of the present research are to determine how the
negatively worded items are related to the SDQ scales as defined by
positively worded items, to grade level, and to reading achievement.
Data in study 1 of the present investigation come from previous
research designed to examine the effect of age and sex on self-concept
(Marsh, Barnes, Cairns & Tidman, in press). In that study it was
demonstrated that: 1) separate exploratory factor analyses of
responses to the positively worded items from each of four age levels
clearly identified the SDQ scales; 2) a linear, negative relationship
existed between age and most of the self-concept scales; 3) student
sex affected several scales in a manner consistent with sex
stereotypes, but that was independent of age; and 4) the SDQ scales
became more distinct with age. Confirmatory factor analyses, using
LISREL, were subsequently performed, supporting conclusions 1 and 4
(Marsh & Hocevar, 1984; Marsh & Shavelson, 1983). In study 1 the
responses to negatively worded items are added to those analyzed
previously to examine the negative item bias and its relationship to
age. New data are collected for study 2 where tests are made of
confirmatory factor analytic models in which a negative item factor
was explicitly defined. Verbal ability measures are incorporated into
these models to determine how responses to the negative items are
related to reading ability.

STUDY 1

METHOD.

Samples and Procedures. Two independent samples were used in
study 1. The first sample consisted of the 170 second grade
(primarily seven year olds) and the 251 fifth grade children
(primarily 10 year olds) who attended one of four public coeducational
schools in Sydney, Australia. Communities served by these schools
varied in socioeconomic status from lower and lower-middle class to
middle and upper-middle class. Across all the children in this
sample, academic abilities tended to be about average. The second
sample in this study consisted of the 103 third grade children
(primarily 8 year olds) and 134 fourth grade children (primarily 9
year olds) who attended one of two public coeducational schools in
Sydney, Australia. Neither of these schools was the same as in sample
1. Children in this second sample were somewhat below average in
terms of academic ability, and tended to come from families in the
lower, lower-middle, and middle social classes.

In study 1 the two samples are not equivalent, but the grade
levels within each of the samples were selected so as to provide a
strong control against linear age effects being the result of
nonequivalent samples. Since the youngest and oldest children in
study 1 come from the same sample, any differences due to
nonequivalent samples would produce a nonlinear age effect where the
results for children in grades 2 and 5 would differ systematically
from those in grades 3 and 4. Thus, while this design is biased
against the demonstration of linear age effects, it provides a
stronger control against such an effect being the result of
nonequivalent groups than is typically available (see Marsh, Barnes,
Cairns & Tidman, in press, for further discussion).

In both samples, the SDQ was administered during a regular class
session approximately one third of the way through the school year,
and was the first measure to be administered as part of a more
extensive battery of tests. The SDQ was administered by one of the
authors of that study according to standardized procedures developed
in previous research. Students responded to each item along a five
point scale which varies from "1 - False" to "5 - True". The SDQ was
read aloud to children to minimize reading difficulties, and they
responded to several examples before any of the SDQ items were
presented. The children were specifically instructed not to say their
responses aloud or talk to other pupils. As a consequence of earlier
research the SDQ was read aloud at a fairly rapid pace, and the whole
questionnaire required approximately 8 minutes to administer (not
including time for the instructions and examples), though children
were given time at the end of the administration to go back to any
items that they had left blank.

Analysis in this study is based upon student responses to 66
items designed to measure seven SDQ factors. A total of 56 items
(eight per factor) are positively worded, while the remaining 10 are
negatively worded. A brief description of the seven SDQ factors is as
follows:



Physical Abilities/Sports (PHYS) -- student ratings of their
ability and enjoyment of physical activities, sports, and games.

Physical Appearance (APPR) -- student ratings of their own
attractiveness, how their appearance compares with others, and how
others think they look.

Relationship With Peers (PEER) -- student ratings of how easily
they make friends, their popularity, and whether others want them as a
friend.

Relationship With Parents -- student ratings of how well they get
along with their parents, whether parents are easy to talk to, whether
their parents like them, and whether they like their parents.

Reading (READ) -- student ratings of their ability in and their
enjoyment/interest in reading.

Mathematics (MATH) -- student ratings of their ability and
enjoyment/interest in mathematics.

School Subjects (SCHL) -- student ratings of their ability and
enjoyment/interest in "all school subjects".

Statistical Analysis. All the statistical analyses described in
study 1 were conducted with the commercially available SPSS program
(Hull & Nie, 1981; Nie, et al., 1975). Before any analyses were
performed, the responses to the negatively worded items were reflected
so that all items varied along a scale where 1 represented the lowest
level of self-concept and 5 the highest. Then a value of 4.0, the
average response, was substituted for all missing responses (less than
1/4 of 1%).
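
The two preparation steps just described can be sketched as follows.
The study itself used SPSS; this Python fragment is only an
illustration, and the function name, the item labels, and the use of
None to mark a missing response are assumptions made for the example.
On a 1-to-5 scale, reflecting a response x gives 6 - x.

```python
def prepare_responses(responses, negative_items, fill_value=4.0):
    """Reflect negatively worded items and fill in missing responses.

    responses      -- dict mapping an item label to a response in 1..5,
                      or to None when the item was left blank
    negative_items -- set of labels for the negatively worded items
    fill_value     -- value substituted for missing responses (the paper
                      reports using 4.0, the average response)
    """
    prepared = {}
    for item, value in responses.items():
        if value is None:
            prepared[item] = fill_value          # missing response
        elif item in negative_items:
            prepared[item] = float(6 - value)    # reflected: 1<->5, 2<->4
        else:
            prepared[item] = float(value)        # positive item, unchanged
    return prepared


# Hypothetical item labels, not actual SDQ item numbers.
example = prepare_responses(
    {"math_pos_1": 5, "math_neg_1": 1, "read_pos_1": None},
    negative_items={"math_neg_1"},
)
```

After this step every item runs in the same direction, so a child who
answers "False" (1) to a negative item receives the same score as a
child who answers "True" (5) to a positive one.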

RESULTS and DISCUSSION.

The purpose of the first set of analyses is to confirm that
negatively worded items are less consistent with other items in the
scale they are designed to measure than are positively worded items.
A series of item analyses (Hull & Nie, 1981) were conducted for the
total sample and separately for each grade level. For the total
sample, the coefficient alphas for every scale and the average
correlation among items within each scale were higher when the
negative items were excluded (see Table 1). This replicates findings
from earlier research. However, examination of the results for the
different grade levels demonstrates that this effect depends upon age.
For the younger children, the exclusion of the negatively worded items
consistently produces the largest improvement in the coefficient
alphas. Also, the negative items form a scale with reasonable internal
consistency -- particularly for the youngest pupils.
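
For readers who want to see how a coefficient alpha responds to an
inconsistent item, a minimal sketch follows. It uses the standard
formula alpha = k/(k-1) * (1 - sum of item variances / variance of
total scores). The data are hypothetical values invented for the
illustration and are not taken from Table 1; the function name is
likewise an assumption.

```python
from statistics import variance


def coefficient_alpha(items):
    """Coefficient alpha for a list of items, where each item is a list
    of scores from the same respondents in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score for each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses from five children (not data from the study).
positive_1 = [1, 2, 3, 4, 5]
positive_2 = [1, 2, 3, 4, 5]
negative_reflected = [3, 1, 4, 2, 3]   # answers unrelated to the others

alpha_positive_only = coefficient_alpha([positive_1, positive_2])
alpha_with_negative = coefficient_alpha(
    [positive_1, positive_2, negative_reflected]
)
```

With these invented numbers alpha falls from 1.0 to roughly 0.73 once
the inconsistent item is added, which is the direction of the effect
reported for the SDQ scales when negative items are included.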

Insert Table 1 About Here
For the total sample, the coefficient alpha for responses to the
set of positive items is .93 and for negative items is .73. Thus, to
the extent that the two sets of items are measuring the same construct
then they should correlate approximately .8 or higher with each other
(i.e., within the limits set by their reliabilities). However, for
the total sample, the correlation between responses to the two sets of
items is only .27, indicating that they are measuring different
constructs (see Table 2). Furthermore, the results illustrate a
dramatic developmental effect. For the youngest children the two sets
of responses are uncorrelated (r = -.02), while the correlations are
much larger for the oldest children (r = .60). Thus, for the youngest
children the negative items are measuring a construct that is
unrelated to self-concept, while for the oldest children the negative
item responses are substantially related to positive item responses
but still contain considerable variance that is reliable and unique.
These results clearly justify the decision to exclude the negatively
worded items in scoring the SDQ, but they also suggest that the method
effect is developmentally related to the age of the subjects.
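
The expectation of a correlation of about .8 follows from classical
test theory, in which the correlation between two measures of the same
construct cannot exceed the square root of the product of their
reliabilities. The sketch below works through that arithmetic with the
alphas and the observed correlation quoted above. The correction for
attenuation is a standard formula and is an added illustration, not a
computation reported in the paper; the variable names are assumptions.

```python
import math

alpha_positive = 0.93   # coefficient alpha, positive items (total sample)
alpha_negative = 0.73   # coefficient alpha, negative items (total sample)
observed_r = 0.27       # observed correlation between the two item sets

# Largest correlation expected if both sets measured the same construct.
ceiling_r = math.sqrt(alpha_positive * alpha_negative)

# Observed correlation corrected for unreliability in both measures.
disattenuated_r = observed_r / ceiling_r

print(round(ceiling_r, 2))        # 0.82
print(round(disattenuated_r, 2))  # 0.33
```

Even after allowing for unreliability, the corrected correlation stays
far below 1.0, which is consistent with the conclusion that the two
sets of items measure different constructs in the total sample.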

Insert Table 2 About Here
The self-concept scores of preadolescent children have been
quite high in all studies with the SDQ. Consistent with those
results, the average response to positively worded items is about 4 on
a five-response scale (i.e., a response of "Mostly True" to positive
statements). If children are responding appropriately to the negative
items, then the average response to them should also be quite high
(after responses to the negative items have been reflected). However,
if some children are responding inappropriately by saying "True" or
"Mostly True" to negative items when their intended meaning is the
opposite, then the means for the negative items should be much lower
(i.e., indicate a poorer self-concept) and the standard deviations
higher.
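
This prediction can be illustrated with a deliberately simplified
model, which is an assumption of this example and not an analysis from
the paper. Suppose every child intends a reflected score of 4 on a
negative item, but a proportion p of children answer in the wrong
direction and so receive the mirror-image score of 2. The mean and
standard deviation of the reflected scores then depend only on p.

```python
import math


def mixture_mean_sd(p, intended=4.0, scale_min=1, scale_max=5):
    """Mean and population SD of reflected negative-item scores when a
    proportion p of respondents answer in the reversed direction.

    All respondents are assumed to share the same intended score; this
    is a hypothetical simplification for illustration.
    """
    reversed_score = scale_min + scale_max - intended
    mean = (1 - p) * intended + p * reversed_score
    # Variance of a two-point distribution with probabilities (1-p, p).
    sd = math.sqrt(p * (1 - p)) * abs(intended - reversed_score)
    return mean, sd


mean_none, sd_none = mixture_mean_sd(0.0)       # nobody reverses
mean_quarter, sd_quarter = mixture_mean_sd(0.25)  # one child in four reverses
```

With no reversals the mean is 4 and the standard deviation is 0; when
one child in four reverses, the mean drops to 3.5 and the standard
deviation rises to about 0.87. Lower means together with larger
standard deviations are the pattern the paragraph above expects if only
some children respond inappropriately.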

For positively worded items the mean response across all scales
shows a consistent and marked decline with age (r = -.20, p < .001).
Marsh, Barnes, Cairns and Tidman (in press) demonstrated that this
effect is consistent across most of the SDQ scales and is primarily a
linear effect. In marked contrast, the average response to the
negatively worded items shows a marked increase with age (r = .23,
p < .001). For the youngest children, responses to the negatively
worded items are much lower than to the positive items (see Table 2).
It is only for the oldest children that the mean response to positive
and negative items is approximately the same. This suggests that some
children are inappropriately giving responses of "True" and "Mostly
True" to negative items when in fact they have positive self-concepts.
Also consistent with this conclusion are the larger standard
deviations for responses to negative items by younger children,
suggesting that some children are responding inappropriately while
others are not.

In summary, some children at each grade level seem to respond
inappropriately to negatively worded items. The phenomenon is clearly
age related and occurs more frequently with younger children. Since
this bias is systematic rather than constant or random, it is
particularly serious. These findings support the decision not to
include responses from negatively worded items in the scores derived
from the SDQ, but they also have important implications for other
rating scales designed for use by children and for the further study
of this effect as a cognitive-developmental phenomenon.

STUDY 2 

The results of study 1 show that responses to negatively worded
items are influenced by a method/halo effect and that this effect
varies with age. The negative items apparently require a higher level
of verbal reasoning in order to respond appropriately, and this is why
the effect is larger for the younger children. Despite the intuitive
appeal of this explanation, study 1 suffers important weaknesses which
limit the strength of the conclusions. The use of exploratory factor
analyses in the original research (Marsh, Barnes, Cairns & Tidman, in
press) precluded a test of whether negatively worded items contributed
to a "negative item factor", to the appropriate scale which the item
was designed to measure, or to both. The suggestion that the negative
item bias is systematically related to verbal reasoning or reading
ability could not be tested directly, since reading scores were not
available. Instead, this inference was based upon the finding that
the negative item effect varied for different age groups and that the
younger children have poorer verbal skills.



The purpose of study 2 is to further examine these issues with
procedures which overcome the weaknesses. A new sample of fifth grade
pupils completed the SDQ and two verbal achievement tests, and were
rated in terms of their reading ability by their teachers. Results of
study 1 showed that the negative item bias was weaker for fifth grade
students compared with younger children, but it was still evident.
In study 2 confirmatory factor analytic (CFA) models were tested
which required that negative items load on the factor which they were
designed to measure, on a separate negative item factor, or on both.
The verbal ability measures were also incorporated into these models
in such a way that the relationship between the negative item bias and
verbal ability could be tested. Since students from only one grade
level were considered, any effect of reading achievement must be
relatively independent of age.

Sample and Procedures. Pupils in study 2 were a new sample of
559 fifth grade students (mostly 10 year olds) enrolled in 19 fifth
grade classes in one of seven private Catholic schools in Sydney,
Australia. None of these schools were the same as employed in study
1. Most of the students attended single-sex classes (18 of the 19
classes). Children in the sample came from families which varied in
socioeconomic status from lower-middle to upper-middle class. Across
all the children in study 2 the academic abilities were about average.
Data considered in study 2 are part of a larger project which is
described in more detail by Marsh, Smith & Barnes (1983a). For
purposes of this analysis, consideration is limited to pupil responses
to the SDQ, results from two verbal ability tests, and teacher ratings
of each pupil's reading ability.

The SDQ was administered in the same manner as described in study
1, but a slightly revised version of the SDQ was employed in study 2.
This version of the SDQ contained 76 items -- the additional 10 items
were designed to measure general self-concept or self-esteem. Thus,
the current version of the SDQ is designed to measure eight factors --
the seven described earlier and a general-self scale. Of the 76
items, a total of 12 were negatively worded (two for each of the three
academic scales and the general self scale, and one each for the four
nonacademic scales).

The two achievement tests of verbal ability were the
Comprehension test and the Word Knowledge test of the Primary Reading
Survey Tests (ACER, 1976). The Word Knowledge test consists of 40
multiple-choice synonym items and takes 20 minutes to answer. The
Comprehension test consists of 34 multiple-choice items and takes 30
minutes. In addition, teachers were asked to judge the reading
ability of each child along a scale that varied from "1 - Very Poor"
to "9 - Very Good", thus providing a third measure of reading ability.
The achievement tests were distributed to the schools by the
researchers, but were actually administered by the classroom teachers
during a regular class session before the administration of the SDQ.
The tests were then scored by the researchers with the understanding
that feedback would be given to the schools after completion of the
study. Two of the schools declined to participate in the achievement
testing, though they did agree to the administration of the SDQ and to
complete teacher ratings. The SDQ was administered along with other
materials during a regularly scheduled class while teachers were asked
to complete a teacher rating form for each child. Some teachers did
not actually complete the forms until later, and one teacher
eventually declined to complete the forms at all.

Statistical Analyses. The CFAs in study 2 were performed with the
commercially available LISREL V program (Joreskog & Sorbom, 1981).
With LISREL V the researcher is able to define alternative factor
solutions designed to test different hypotheses, and to compare the
ability of competing models to fit the original data (see Joreskog &
Sorbom, 1981; Long, 1983). The LISREL V program, after testing for
identification, attempts to minimize a maximum likelihood function
which is based upon differences between the original and reproduced
covariance matrix, and provides an overall chi-square goodness-of-fit
test (Joreskog & Sorbom, 1981; Maruyama & McGarvey, 1980). For large
complex problems with large sample sizes, the observed chi-square will
nearly always be statistically significant, and alternative
indications of goodness-of-fit are required. The most commonly used
alternative is the ratio of the chi-square to the degrees-of-freedom
(df) in the model. However, this value is still directly related to
the sample size such that the same solution will lead to a much larger
ratio when based upon more cases. Other indices have been developed
which are not affected by sample size. LISREL V presents the root
mean square residual (RMS) which is based upon the residual
covariances (the difference between the original and reproduced
correlations in this example). Bentler & Bonett (1980) developed an
index called coefficient d which scales the observed chi-square along
a scale which varies from zero to 1.0. The zero point represents the
chi-square obtained from a null model (normally one which results in a
reproduced covariance matrix which is diagonal) and 1.0 represents an
ideal fit. Thus, it is like an estimate of the variance which can be
explained by a given model.
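
The two chi-square based indices described above can be sketched as
follows. This is only an illustration: the function names are
hypothetical, and coefficient d is assumed to take the form
(null chi-square - model chi-square) / null chi-square, which is
consistent with the description of its zero and 1.0 points. The input
values are those reported in Table 4 for models 1.0 and 1.1.

```python
def chi_square_df_ratio(chi_square, df):
    """Ratio of the model chi-square to its degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return chi_square / df


def coefficient_d(chi_square_model, chi_square_null):
    """Proportional reduction in chi-square relative to the null model.

    Assumed form: (null - model) / null, so the null model scores 0
    and a model whose chi-square is 0 scores 1.
    """
    if chi_square_null <= 0:
        raise ValueError("null-model chi-square must be positive")
    return (chi_square_null - chi_square_model) / chi_square_null


# Values as reported in Table 4 for the positive item pairs.
null_chi_square, null_df = 11263, 496   # model 1.0 (null model)
full_chi_square, full_df = 1020, 436    # model 1.1 (full model)

print(round(chi_square_df_ratio(full_chi_square, full_df), 2))
print(round(coefficient_d(full_chi_square, null_chi_square), 2))
```

Under this assumed form, the sketch gives a ratio of 2.34 and a
coefficient d of .91 for model 1.1, matching the tabled values.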

In preliminary analyses, the factor structure underlying the 64
positively worded items from the SDQ (i.e., 8 items from each of eight
scales) was examined. For purposes of this and subsequent analyses,
each scale was defined by four variables representing the total
response to a pair of items. Within each scale, the first two items
which were positively worded defined the first item pair, the next two
the second item pair, and so forth. This is the same procedure used by
Marsh, Barnes, Cairns and Tidman (in press) and other SDQ research
(see Marsh & O'Niell, in press, for further discussion). In the next
series of analyses the 12 negatively worded items were included,



only one negative item. A series of different CFA models were defined
in which each negative item was required to load only on the factor
which it was designed to measure, only on a ninth, "negative item"
factor, or on both. The ability of each of these models to fit the
data was tested. In the final set of analyses the three reading
scores were used to define a reading ability factor, and this factor
was related to the self-concept factors and to the negative item
factor.
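
The item-pair scoring described above (eight positively worded items
per scale collapsed into four measured variables) can be sketched as
follows. The function name and the example responses are hypothetical
and serve only to illustrate the summing of consecutive pairs.

```python
def item_pair_scores(responses, pair_size=2):
    """Sum consecutive items into item-pair scores.

    `responses` holds one child's responses to the positively worded
    items of a single scale, in their order of appearance.
    """
    if len(responses) % pair_size != 0:
        raise ValueError("number of items must be a multiple of pair_size")
    return [
        sum(responses[start:start + pair_size])
        for start in range(0, len(responses), pair_size)
    ]


# Hypothetical responses (1-5 scale) to the 8 positive items of one scale.
example_responses = [5, 4, 4, 3, 5, 5, 2, 4]
print(item_pair_scores(example_responses))
```

Each scale thus contributes four variables to the analysis, giving 32
item pairs across the eight scales.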

All the analyses were based upon a 47 x 47 correlation matrix
representing the 32 positively worded item pairs, the 12 negatively
worded items, and the three reading scores. For the self-concept
responses there was almost no missing data (less than 1/10 of 1% of
the responses) and the mean response was substituted for the few
missing values. However, for the teacher ratings of reading ability
there were 3£ missing values (6%), representing primarily students
from one class where the teacher did not complete the ratings, and 142
missing values (25%) for the reading tests, representing primarily
students from two schools which did not administer the achievement
tests. For purposes of this study pair-wise deletion of missing data
was used in the determination of the correlation matrix (see Nie, et
al., 1976). However, a similar correlation matrix based upon only
those cases which had no missing data for the three reading measures
was virtually the same as the one which was actually used. Thus,
while the large number of missing values for the reading scores does
require that the results be interpreted cautiously, it is unlikely to
have any substantial effect.
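
Pair-wise deletion, as used above, computes each correlation from the
cases that have both of the two variables involved, so different
correlations may rest on different numbers of cases. The sketch below
is a minimal illustration in plain Python, not the SPSS routine that
was actually used; the function names and the data are hypothetical,
and missing values are represented by None.

```python
from math import sqrt


def pairwise_correlation(x, y):
    """Pearson r over the cases where both x and y are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / sqrt(sxx * syy)


def pairwise_correlation_matrix(columns):
    """Correlation matrix (as a dict keyed by name pairs) with pair-wise deletion."""
    names = list(columns)
    return {
        (row, col): 1.0 if row == col
        else pairwise_correlation(columns[row], columns[col])
        for row in names
        for col in names
    }


# Hypothetical data: the second child has no reading test score.
data = {
    "sdq_pair": [1, 2, 3, 4, 5],
    "reading_test": [2, None, 6, 8, 10],
}
matrix = pairwise_correlation_matrix(data)
print(matrix[("sdq_pair", "reading_test")])
```

A list-wise alternative would instead drop every case with any missing
reading measure before computing all correlations, which is the
comparison matrix described in the text.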
RESULTS

CFA of the Positively Worded Item Pairs. In CFA (confirmatory
factor analysis) alternative models are specified by fixing or
constraining elements in three matrices which are conceptually similar
to matrices resulting from common factor analysis. These are:

1) LAMBDA Y, a matrix of factor loadings;

2) PSI, a factor correlation matrix which represents the
relationships among the factors; and

3) THETA EPSILON, a diagonal matrix of error/uniqueness terms
that are conceptually similar to one minus the communality estimates
in exploratory factor analysis.
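
The way these three matrices combine is not written out in the text.
In the standard confirmatory factor analysis formulation (stated here
as an assumption about the model being fitted, with symbols chosen to
match the matrix names above), the reproduced covariance matrix is:

```latex
\Sigma = \Lambda_y \, \Psi \, \Lambda_y^{\top} + \Theta_{\varepsilon}
```

Here \Sigma is the reproduced covariance (in this study, correlation)
matrix of the measured variables, \Lambda_y holds the factor loadings,
\Psi the factor correlations, and \Theta_{\varepsilon} the diagonal
error/uniqueness terms.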

The results of the CFA (see Table 3) illustrate the pattern of
parameters to be estimated in these three matrices, using only the
positive item pairs. All coefficients with a value of "0" or "1" are
fixed (i.e., predetermined) and not estimated as part of the analysis,
while other parameters are free and estimated in the analysis. For
this problem 32 measured variables are used to define eight factors;
the free parameters consist of the 32 factor loadings in LAMBDA Y, the
28 correlations among the eight factors in PSI, and the 32
error/uniquenesses in THETA. This factor pattern is very restrictive
in that it allows each variable to load on one and only one factor,
and represents an ideal of "simple structure." The parameter
estimates (see Table 3) indicate that each of the eight self-concept
factors is well-defined. The goodness-of-fit indices (see Table 4)
indicate that the model adequately explains the data. Despite the
large sample size, the chi-square/df ratio is only slightly larger
than 2, while the values for RMS and coefficient d each indicate that
the fit is good.
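
The degrees of freedom reported in Table 4 follow from the parameter
counts given above. A minimal sketch, assuming the usual rule that df
equals the number of distinct elements in the analyzed matrix minus
the number of free parameters (the function names are hypothetical):

```python
def distinct_elements(n_variables):
    """Distinct variances and covariances among n measured variables."""
    return n_variables * (n_variables + 1) // 2


def model_df(n_variables, n_loadings, n_factor_correlations, n_uniquenesses):
    """Degrees of freedom: distinct elements minus free parameters."""
    free_parameters = n_loadings + n_factor_correlations + n_uniquenesses
    return distinct_elements(n_variables) - free_parameters


# Model 1.1: 32 loadings, 28 factor correlations, 32 uniquenesses.
print(model_df(32, 32, 28, 32))

# Null model 1.0: only the 32 uniquenesses are free.
print(model_df(32, 0, 0, 32))
```

With 32 measured variables there are 528 distinct elements, so the
sketch gives 436 df for model 1.1 and 496 df for the null model, the
values shown in Table 4.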



CFA of Positive & Negative Items. In the second set of CFA
models, the 12 negatively worded items are added to the variables
shown in Table 3. In models 2.1 - 2.3 each negative item is required
to load only on the self-concept factor that it was designed to
measure (model 2.1), or only on a ninth, negative item factor (model
2.2), or on both the self-concept factor and the negative item factor
(model 2.3; see Table 5). Inspection of the goodness-of-fit indices
(see Table 4) indicates that model 2.3 provides the best fit to the
data. Thus, variance in responses to the negative items represents
both the factors which the items were designed to measure and a
method/halo effect.



The parameter estimates for model 2.3 (see Table 5) indicate that
the inclusion of the negative items has virtually no effect on the
parameter estimates for the positively worded item pairs; the
parameter estimates in Table 5 are nearly the same as in Table 3.
Factor loadings on the self-concept factors are smaller for the
negatively worded items than for the positively worded items, but all
of the loadings are statistically significant (p < .01). The loadings
for the negative items are somewhat smaller on the negative item
factor than for the self-concept factors, but 11 of the 12 loadings on
the negative item factor are also statistically significant (p < .01).
Correlations between the negative item factor and the self-concept
factors (in the PSI matrix) are all close to zero and only the
correlation with Reading self-concept (r = .13) reaches significance
at the .01 level (the correlation with Math self-concept reaches
significance at p < .05). This demonstrates that the negative item
bias is nearly uncorrelated with any of the self-concept scales.

Insert Tables 3 & 4 About Here

Insert Table 5 About Here

Model 2.4 differs from 2.3 only in that the 8 correlations (in
the PSI matrix) between the negative item factor and the self-concept
factors were fixed to be zero. Inspection of the goodness-of-fit
indices demonstrates that this model fits the data nearly as well as
model 2.3 (in which each of these correlations was estimated but was
observed to be close to zero). Despite the large sample size the
difference in chi-squares between models 2.3 and 2.4 fails to reach
statistical significance at p < .01, though it is significant at
p < .05 (chi-square difference = 16, df = 8, p < .05).
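
The chi-square difference test between nested models such as 2.3 and
2.4 can be sketched as follows. This is an illustration rather than
the LISREL computation: the function names are hypothetical, and the
tail probability uses the closed-form expression that holds only when
the df difference is even (as it is here, with df = 8). The input
values are those reported in Table 4.

```python
from math import exp


def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate, for even df only.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    if x < 0:
        raise ValueError("chi-square value must be non-negative")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total


def chi_square_difference_p(chi_restricted, df_restricted, chi_full, df_full):
    """p-value for the chi-square difference between two nested models."""
    return chi_square_sf_even_df(
        chi_restricted - chi_full, df_restricted - df_full
    )


# Model 2.4 (restricted) versus model 2.3 (less restricted), from Table 4.
p_value = chi_square_difference_p(1838, 862, 1822, 854)
print(0.01 < p_value < 0.05)
```

For a difference of 16 with 8 df the tail probability falls between
.01 and .05, in line with the significance levels reported above.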

CFA With Reading Measures. In the third set of analyses, the
three reading measures were added to the variables described in models
2.0 - 2.4. In each instance the three reading measures were used to
define an additional factor called reading ability. The three reading
measures were free to load on this additional factor, but not on any
other factors. Again, model 3.3, where the negative items were
allowed to load on both the self-concept factors and the negative item
factor, was able to explain the data substantially better than models
in which the negative items loaded only on the self-concept factors or
only on the negative item factor. Also, model 3.4, where correlations
between the negative item factor and the self-concept factors were
fixed to be zero, was nearly indistinguishable from model 3.3.

Inspection of the parameter estimates for model 3.3 (see Table 6)
shows that for the self-concept variables, both the positively and
negatively worded items, the estimates are nearly the same as for
model 2.3. The Reading Ability factor is well defined in that each of
the three variables designed to define it loads substantially on that
factor. The Reading Ability factor correlates substantially with
Reading self-concept (r = .43), but not with any of the other
self-concept factors. The Reading Ability factor is also substantially
correlated with the Negative Item factor (r = .42).

The correlations between the Reading Ability factor and the other
factors in model 3.3 are particularly important for this study. The
negative item factor represents a method/halo bias, and these results
show that this bias is substantially correlated with reading ability.
Children with poorer reading skills are more likely to respond "True"
to negatively worded items rather than to respond in a manner
consistent with their responses to positive items. The finding that
reading ability is only correlated with Reading self-concept, but not
with other self-concept factors, further demonstrates the
distinctiveness of the different self-concept factors. In summary,
these findings demonstrate that negative items contribute
significantly to both the scale they were designed to measure and to a
negative item bias. The negative item bias is nearly uncorrelated
with the self-concept factors but is substantially correlated with
reading achievement.



DISCUSSION OVERVIEW

In each study, the results suggest that negatively worded items
are often responded to inappropriately by preadolescent children.
When forced to use the more difficult reasoning required by the
negatively worded items, children often respond "True" or "Mostly
True", implying a poor self-concept, even though their responses to
positively worded items indicate that they have favorable
self-concepts (see footnote 1). This phenomenon is more likely to occur
for younger children and for children with poorer reading ability.
Since most children have high self-concepts (i.e., the average
response is 4 on a 5 point response scale), children who are younger
and/or who have poorer reading skills will inappropriately appear to
have systematically lower self-concepts than other children merely as
an artifact of the negative item bias. The demonstration of the
substantial correlation between reading achievement and the negative
item bias in a single year group indicates that the effect of reading
on the bias is relatively independent of age. The negative item effect
will bias interpretations of self-concept scores so that they
erroneously appear to be more highly correlated to reading achievement
and other academic achievement scores that are frequently used to
validate self-concept measures, and so that comparisons across age
groups are invalid.

While the results of these two studies clearly justify the
decision to exclude responses from the negatively worded items when
scoring the SDQ, several features of the present investigation may
limit the generalizability of the conclusions. Trott and Jackson
(1967) found that a method effect varied with the amount of time
subjects had to study each item, and so if the SDQ items were
presented at a slower pace the negative item bias might be smaller.
Furthermore, the complications involved in using a five-point response
scale may have exacerbated the negative item bias. However, Marsh and
Smith (1982) identified a substantial negative item factor in
responses by fifth and sixth grade students to the Coopersmith Self
Esteem Instrument. On the Coopersmith, half the items are negatively
worded, subjects respond to each item with either a "Like Me" or "Not
Like Me" response, and students were given longer to respond to each
item. Hence, the results from the Coopersmith instrument indicate
that the negative item effect may generalize to instruments in which a
larger proportion of the items are negatively worded, the response
scale has only two categories, and the pace of item presentation is
slower.


This investigation is based on responses to a self-concept
instrument, but it is likely that a similar phenomenon occurs with
other rating instruments as well. The double negative logic required
to answer negative items appropriately is not limited to self-concept
items, and the negative items on the SDQ were more carefully
constructed to avoid this problem than is typically the case with
other rating scales. Also, while the findings of this study are
limited to the responses of preadolescent children, a similar
phenomenon may occur with the responses of older subjects. Thus, the
type of analysis described here, particularly the CFA with the
inclusion of verbal ability measures, provides a model for other
studies to examine the operation of such halo/method effects.

The focus of this study has been on the effect of negatively
worded items as a bias to rating instruments that are used by
preadolescent children. However, the contention that the effect is a
cognitive-developmental phenomenon was strongly supported, and further
research into the substantive aspects of this effect should prove
valuable. The results of study 1 show that there is a dramatic
developmental shift during early school years in the ability of
preadolescent children to respond appropriately to this type of rating
item. These results correspond with the conclusion by Braine and
Rumain (1983) about the age at which children can appropriately use
inference schemas of reasoning that require double negative logic.
The results of study 2 show that within a single grade level, there
were substantial individual differences in the size of the effect and
these are related to verbal achievement. Hence, the substantial
effect of verbal achievement in study 2 is relatively independent of
age, even though the age effect in study 1 was confounded by
differences in verbal achievement. Further research is clearly needed
to relate this cognitive-developmental effect to cognitive stages of
early development considered in other research.






FOOTNOTES 

1. Preadolescent children typically use the most favorable three
response categories when responding to self-concept items on a
five-point response scale, indicating positive self-concepts. This makes
it relatively easy to recognize when children with the most favorable
self-concepts are responding inappropriately to negatively worded
items, since the appropriate and inappropriate responses are at
opposite ends of the response scale. However, it is more difficult to
recognize when children with the least favorable self-concepts are
responding inappropriately, since both appropriate and inappropriate
responses would tend to be near the middle of the response scale.
Consequently, attempts to estimate the frequency of occurrence of the
negative item bias are likely to underestimate its actual occurrence.

REFERENCES 

Antill, J. K., Cunningham, J. D., Russell, G., & Thompson, N. L.
(1981). An Australian sex-role scale. Australian Journal of
Psychology, 33, 169-183.

Australian Council for Educational Research (ACER). (1976). Primary
Reading Survey Tests. Hawthorn, Victoria, Australia: ACER.

Bentler, P. M. & Bonett, D. G. (1980). Significance tests and goodness
of fit in the analysis of covariance structures. Psychological
Bulletin, 88, 588-606.

Bridgeman, B. & Shipman, V. C. (1978). Preschool measures of
self-esteem and achievement motivation as predictors of third-grade
achievement. Journal of Educational Psychology, 70, 17-28.

Cronbach, L. J. (1980). Validity on parole: How can we go straight? In
W. B. Schrader (Ed.), Measuring Achievement: Progress Over a
Decade (p. 99-108). New directions for testing and measurement.
San Francisco: Jossey-Bass.

Cronbach, L. J. (1971). Test Validation. In R. L. Thorndike (Ed.),
Educational Measurement (2nd Ed., p. 443-507). Washington D.C.:
American Council on Education.

Hull, C. H. & Nie, N. H. (1981). SPSS Update 7-9. New York:
McGraw-Hill.

Jackson, D. N. (1967). Acquiescence response styles: Problems in
identification and control. In I. A. Berg (Ed.), Response set in
personality assessment (p. 71-111). Chicago: Aldine.

Jackson, D. N. & Messick, S. (1958). Content and style in personality
assessment. Psychological Bulletin, 55, 243-252.

Jackson, D. N. & Messick, S. (1961). Acquiescence and desirability as
response determinants on the MMPI. Educational and Psychological
Measurement, 21, 771-790.

Joreskog, K. G. & Sorbom, D. (1981). LISREL V: Analysis of Linear
Structural Relations By the Method of Maximum Likelihood.
Chicago: International Educational Services.




Klima, E. S. & Bellugi-Klima, U. (1971). Syntactic regularities in the
speech of children. In A. Bar-Adon & W. F. Leopold, Child
language: A book of readings (p. 412-424). Englewood Cliffs, NJ:
Prentice-Hall.

Kun, A. (1977). Development of the magnitude-covariation and
compensatory schema in ability and effort attributions of
performance. Child Development, 48, 862-873.

Long, J. S. (1983). Confirmatory Factor Analysis. Beverly Hills: Sage.

Marsh, H. W., Barnes, J., Cairns, L. & Tidman, M. (in press). The Self
Description Questionnaire (SDQ): Age effects in the structure and
level of self-concept for preadolescent children. Journal of
Educational Psychology, (in press).

Marsh, H. W., Cairns, L., Relich, J., Barnes, J., & Debus, R. L.
(1984). The relationship between dimensions of self-attribution
and dimensions of self-concept. Journal of Educational Psychology,
(in press).

Marsh, H. W. & Hocevar, D. (1984). The application of confirmatory
factor analysis to the study of self-concept: First and higher
order factor structures and their invariance across age groups.
School of Education, University of Southern California, (in review).

Marsh, H. W. & Parker, J. W. (in press). Determinants of student
self-concept: Is it better to be a relatively large fish in a small
pond even if you don't learn to swim as well. Journal of
Personality and Social Psychology.

Marsh, H. W., Parker, J. W. & Smith, I. D. (1983). Preadolescent
self-concept: Its relation to self-concept as inferred by teachers
and to academic ability. British Journal of Educational
Psychology, 53, 60-78.

Marsh, H. W., Relich, J. D. & Smith, I. D. (1983). Self-concept: The
construct validity of interpretations based upon the SDQ. Journal
of Personality and Social Psychology, 45, 173-187.

Marsh, H. W. & Smith, I. D. (1982). Multitrait-multimethod analyses of
two self-concept instruments. Journal of Educational Psychology,
74, 430-440.

Marsh, H. W., Smith, I. D. & Barnes, J. (1983a). Multidimensional
self-concepts: Relationships with sex and academic achievement.
Department of Education, University of Sydney, (in review).

Marsh, H. W., Smith, I. D. & Barnes, J. (1983b). Multitrait-
multimethod analyses of the Self Description Questionnaire:
Student-teacher agreement on multidimensional ratings of student
self-concept. American Educational Research Journal, 20, 333-337.



Marsh, H. W., Smith, I. D., Barnes, J. & Butler, S. (1983).
Self-concept: Reliability, dimensionality, validity, and the
measurement of change. Journal of Educational Psychology, 75,
772-790.

Braine, M. D. S. & Rumain, B. (1983). Logical reasoning. In J. H. Flavell
& E. M. Markman (Volume Eds.), P. H. Mussen (Ed.), Handbook of
child psychology: Cognitive development (Vol. III, p. 263-340).
New York: Wiley.

Maruyama, G. & McGarvey, B. (1980). Evaluating causal models: An
application of maximum likelihood analysis of structural
equations. Psychological Bulletin, 87, 502-512.

Naylor, F. D. (1978). Success and failure experiences and the factor
structure of the State-Trait Anxiety Inventory. Australian
Journal of Psychology, 30, 217-226.

Nicholls, J. (1978). The development of the concept of effort and
ability, perceptions of academic attainment, and the understanding
that difficult tasks require more ability. Child Development, 49,
800-814.

Nie, N. H., Hull, C. H., Jenkins, J. G., Steinbrenner, K. & Bent,
D. H. (1973). Statistical Package for the Social Sciences. New
York: McGraw-Hill.

Shavelson, R. J. & Bolus, R. (1982). Self-concept: The interplay of
theory and methods. Journal of Educational Psychology, 74, 3-17.

Shavelson, R. J., Hubner, J. J. & Stanton, G. C. (1976). Validation of
construct interpretations. Review of Educational Research, 46,
407-441.

Spence, J. T., Helmreich, R. L., & Holahan, C. K. (1979). Negative and
positive components of psychological masculinity and femininity
and their relationships to self-reports of neurotic and acting out
behaviors. Journal of Personality and Social Psychology, 37, 1673-
1682.

Trott, D. M. & Jackson, D. N. (1967). An experimental analysis of
acquiescence. Journal of Experimental Research in Personality, 2,
278-288.

Wiggins, J. S. (1973). Personality and prediction: Principles of
personality assessment. Menlo Park, CA: Addison-Wesley.

Wylie, R. C. The self-concept (Rev. ed., Vol. 1). (1974). Lincoln:
University of Nebraska Press.

Wylie, R. C. The self-concept (Vol. 2). (1979). Lincoln: University of
Nebraska Press.






TABLE 1

Coefficient Alphas and Average Item Intercorrelations for Scales
With and Without Negatively Worded Items



Grade 2 Grade 3 Grade 4 Grade 5 Total 

Scale With Without With Without With Without With Without With Without 



PHYS 

alpha .66 

avg r .22 

APPR 

alpha . 83 

avg r . 36 

PEER 

alpha .74 

avg r . 28 

PRNT 

alpha .64 

avg r .24 

Total Non-Academic

alpha .88

avg r .20



READ 
alpha 
avg r 

MATH 
alpha 
avg r 

SCHL 
alpha 
avg r 

Total Academic
alpha
avg r

Total

alpha 
avg r 



.75 
. 26 

, 79 
, 30 

76 
25 



90 
24 



93 
19 



Negative Items

alpha .73
avg r .26



. 78 
. 31 

. 85 
.41 

. 83 
38 

. 80 
33 



, 92 
27 



.84 
. 39 

, 82 
. 46 

82 
36 



93 
33 



. 95 
26 



65 .71 
19 ,23 



, 80 
32 



. 84 
13 



. 83 
. 33 

, 83 
32 



92 
27 



81 
35 



.70 .72 

.21 .24 

58 .66 

16 .20 



5 



.74 .74 
23 .26 



.87 
.35 

,83 
38 



92 
32 



91 . 92 
15 . 17 



65 
16 



, 76 
.26 

- 89 
. 48 

.84 
. 38 

, 69 
. 24 



, 91 
,23 



. 80 
.32 

. 90 
. 52 

.87 
. 45 

. 77 
. 29 



. 92 
. 26 



.86 .86 

.40 .41 

.90 .90 

.47 .54 

,83 .83 

34 .38 



91 .90 
26 .28 



94 .94 
18 . 21 



.75 
.26 

. 88 
. 46 

. 80 
. 32 

.76 
. 29 



67 
17 



93 
33 



93 
17 



63 
15 



. 78 
. 30 

. 87 
. 46 

.81 
. 45 

79 
33 



89 . 88 
18 .20 



.90 .90 

. 49 . 53 

.90 .91 

. 49 . 55 

,85 .85 

36 .41 



, 93 
,35 



93 
19 



.69 .77 
.23 .30 



.86 
. 42 

. 78 
. 30 

. 69 
.24 



. 88 
18 



. 87 
. 45 

.82 
, 36 

, 76 
30 



, 90 
22 



.84 .86 

.36 .44 

.86 .89. 

-40 .51 
i 

.8; .84 

.31 .39 



, 91 . 92 

27 . 34 



93 . 93 
17 ,18 



. 73 
21 




TABLE 2

Means and Standard Deviations For Positively and Negatively
Worded Items, and Correlations Between the Two Sets of Items

          Positively Worded      Negatively Worded      Correlation
Grade     Items (N = 56 Items)   Items (N = 10 Items)   Between Two
Level     Mean      SD           Mean      SD           Item Sets

2         4.24      0.69         3.43      1.10         -.02
3         4.12      0.54         3.88      0.75         +.42 *
4         4.03      0.57         3.91      0.67         +.60 *
5         3.97      0.56         4.00      0.63         +.59 *
Total     4.C2      0.59         3.84      0.77         +.27 *

* p < .01






TABLE 3

LISREL Maximum Likelihood Estimates For Parameters in
Model 1.1: 8 Self-concept Factors (positively worded items only)

Factor Loading Matrix (LAMBDA)

Variables  PHYS  APPR  PEER  PRNT  READ  MATH  SCHL  GENL   Uniqueness/error

Phys1        79*    0     0     0     0     0     0     0      37*
Phys2        84*    0     0     0     0     0     0     0      30*
Phys3        81*    0     0     0     0     0     0     0      *?*
Phys4        80*    0     0     0     0     0     0     0      35*
Appr1         0    70*    0     0     0     0     0     0      52*
Appr2         0    63*    0     0     0     0     0     0      61*
Appr3         0    85*    0     0     0     0     0     0      28*
Appr4         0    81*    0     0     0     0     0     0      34*
Peer1         0     0    76*    0     0     0     0     0      42*
Peer2         0     0    78*    0     0     0     0     0      39*
Peer3         0     0    75*    0     0     0     0     0      44*
Peer4         0     0    82*    0     0     0     0     0      32*
Prnt1         0     0     0    48*    0     0     0     0      77*
Prnt2         0     0     0    52*    0     0     0     0      72*
Prnt3         0     0     0    81*    0     0     0     0      35*
Prnt4         0     0     0    84*    0     0     0     0      30*
Read1         0     0     0     0    86*    0     0     0      26*
Read2         0     0     0     0    87*    0     0     0      24*
Read3         0     0     0     0    84*    0     0     0      29*
Read4         0     0     0     0    84*    0     0     0      30*
Math1         0     0     0     0     0    84*    0     0      30*
Math2         0     0     0     0     0    88*    0     0      23*
Math3         0     0     0     0     0    89*    0     0      22*
Math4         0     0     0     0     0    91*    0     0      16*
Schl1         0     0     0     0     0     0    78*    0      40*
Schl2         0     0     0     0     0     0    66*    0      56*
Schl3         0     0     0     0     0     0    78*    0      39*
Schl4         0     0     0     0     0     0    85*    0      28*
Genl1         0     0     0     0     0     0     0    66*     57*
Genl2         0     0     0     0     0     0     0    75*     44*
Genl3         0     0     0     0     0     0     0    80*     36*
Genl4         0     0     0     0     0     0     0    71*     49*

Correlations Among Factors (PSI)

Factors    PHYS  APPR  PEER  PRNT  READ  MATH  SCHL  GENL

PHYS          1
APPR         42*    1
PEER         66*   49*    1
PRNT         40*   27*   47*    1
READ         16*   08    17*   06     1
MATH         28*   21*   26*   24*   13*    1
SCHL         40*   26*   34*   31*   43*   74*    1
GENL         73*   49*   80*   53*   27*   40*   54*    1

* p < .01

Note: Parameters with values of 0 and 1 were fixed and not estimated
as part of the analysis. The four measured variables designed to
measure each factor are the sum of responses to pairs of positively
worded items.




TABLE 4

Summaries of Goodness of Fit Indices for the CFA Models Containing
Self-concept (SC), Negative Item, and Reading Ability Factors

Model Description                      chi-      df    chi-sq/    RMS    Coeff
                                       square          df ratio          d

1) Positive Items Only

 1.0 Null Model                        11,263    496   22.70      .303   .00

 1.1 Full Model                         1,020    436    2.34      .044   .91
     (see Table 3)

2) Positive & Negative Items

 2.0 Null Model                        14,163    946   14.97      .263   .00

 2.1 8 SC factors with Neg
     items on SC factors only           2,250    874    2.57      .056   .84

 2.2 8 SC factors & 1 Neg item
     factor with Neg items on
     Neg item factor only               2,808    866    3.24      .077   .80

 2.3 8 SC factors & 1 Neg item
     factor with Neg items on
     both SC & Neg item factors         1,822    854    2.13      .046   .87
     (see Table 5)

 2.4 Same as 2.3 with corr's
     between SC factors & Neg
     item factor set to 0               1,838    862    2.13      .048   .87

3) Positive & Negative Items,
   and Reading Ability Factor

 3.0 Null Model                        15,415   1081   14.26      .253   .00

 3.1 Model 2.1 with reading
     ability factor                     2,715    998    2.72      .062   .82

 3.2 Model 2.2 with reading
     ability factor                     3,224    989    3.26      .077   .79

 3.3 Model 2.3 with reading
     ability factor                     2,234    977    2.29      .050   .86
     (see Table 6)

 3.4 Model 2.4 plus reading
     ability factor                     2,251    985    2.29      .052   .85






